High-Quality Prosody Generation in Mandarin Text-to-Speech System

Authors

  • Qing Guo
  • Jie Zhang
  • Nobuyuki Katae
  • Hao Yu
Abstract

A text-to-speech (TTS) synthesizer is a computer-based system that can automatically read text aloud. Fujitsu is developing a Mandarin TTS system using state-of-the-art technologies. The prosodic structure of the text to be synthesized provides important information for making the speech produced by a TTS system more natural and understandable. This paper describes a global probability estimation method for predicting prosodic words, which are the lowest-level constituents of the prosodic structure. Experimental results for this method are very promising: they are better than those for our previous binary prosodic tree method in terms of both accuracy and memory cost.

Similar resources

Modular Design for Mandarin Text-to-speech Synthesis

In the European Union funded project Technology and Corpora for Speech-to-Speech Translation (TC-STAR) [3], we have developed a modular concatenative TTS system for Mandarin Chinese. A common architecture has been introduced based on well-defined modules and interfaces. Three main modules, text processing, prosody processing and acoustic synthesis modules, are used following a commonly employed...

An Example-based Approach for Prosody Generation in Chinese Speech Synthesis

Prosody generation is an important issue in text-to-speech systems. In this paper we present an example-based approach for prosody generation in Mandarin Chinese speech synthesis. The general idea is to obtain the prosodic information from real speech examples. We first analyze the given Chinese text and form a linguistic feature vector, which describes the phonetic and lexicon char...

Unsupervised prosody labeling for constructing Mandarin TTS

This paper introduces an unsupervised prosody labeling method for preparing a large speech corpus used in developing a Mandarin Text-to-Speech system. Adopting a four-layer prosody hierarchy, the proposed method performs an unsupervised segmental clustering that iteratively segments spoken utterances into strings of prosodic constituents and models the patterns of the segmented prosodic constit...

On Cross-Dialect and -Speaker Adaptation of Speaking Rate-Dependent Hierarchical Prosodic Model for a Hakka Text-to-Speech System

This paper presents an effective adaptation of an existing speaking rate-dependent hierarchical prosodic model (SR-HPM) for Mandarin to construct the SR-HPM for Hakka, another Chinese dialect. Based on the cross-dialectal linguistic similarities in terms of syntactic and prosodic structures, the adaptation is formulated as a maximum a posteriori (MAP) estimation problem with the existing Mandari...

Automatic Prosody Generation in a Text-to-speech System for Hebrew

The paper presents the module for automatic prosody generation within a system for automatic synthesis of high-quality speech based on arbitrary text in Hebrew. The high quality of synthesis is due to the high accuracy of automatic prosody generation, enabling the introduction of elements of natural sentence prosody of Hebrew. Automatic morphological annotation of text is based on the applicati...

Publication date: 2010